An articulation model for audiovisual speech synthesis - Determination, adjustment, evaluation
Authors
Abstract
The authors present a visual articulation model for speech synthesis and a method to obtain it from measured data. This visual articulation model is integrated into MASSY, the Modular Audiovisual Speech SYnthesizer, and used to control visible articulator movements described by six motion parameters: one for the up-down movement of the lower jaw, three for the lips and two for the tongue. The visual articulation model implements the dominance principle as suggested by Löfqvist (1990). The parameter values for the model derive from measured articulator positions. To obtain these data, the articulation movements of a female speaker were measured with the 2D-articulograph AG100 and simultaneously filmed. The visual articulation model is adjusted and evaluated by testing word recognition in noise.
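As a rough illustration of the dominance principle referred to above, the following Python sketch blends segment-specific parameter targets using negative-exponential dominance functions, in the spirit of the Cohen–Massaro formalization of Löfqvist's idea. All function names, targets, and time constants here are illustrative assumptions, not the values determined in the paper.

```python
import numpy as np

def dominance(t, center, alpha, theta, c=1.0):
    """Negative-exponential dominance of one speech segment at time t (assumed form)."""
    return alpha * np.exp(-theta * np.abs(t - center) ** c)

def blend_parameter(t, segments):
    """Dominance-weighted average of the segments' targets for a single motion parameter."""
    weights = np.array([dominance(t, s["center"], s["alpha"], s["theta"]) for s in segments])
    targets = np.array([s["target"] for s in segments])
    return float(np.sum(weights * targets) / np.sum(weights))

# Hypothetical two-segment sequence (open vowel followed by a bilabial closure)
# acting on a single lip-opening parameter; the numbers are made up for illustration.
segments = [
    {"center": 0.10, "target": 0.8, "alpha": 1.0, "theta": 20.0},
    {"center": 0.25, "target": 0.0, "alpha": 1.2, "theta": 25.0},
]

trajectory = [blend_parameter(t, segments) for t in np.linspace(0.0, 0.35, 8)]
print([round(v, 2) for v in trajectory])
```

In a model like the one described in the abstract, each of the six motion parameters (jaw, lips, tongue) would receive its own per-segment targets and dominance settings, and the blend yields smooth, coarticulated trajectories.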
Similar articles
An expandable web-based audiovisual text-to-speech synthesis system
The authors propose a framework for audiovisual speech synthesis systems [1] and present a first implementation of the framework [2], which is called MASSY, the Modular Audiovisual Speech SYnthesizer. This paper describes how the audiovisual speech synthesis system, the ‘talking head’, works, how it can be integrated into web applications, and why it is worthwhile using it. The presented application...
Two articulation models for audiovisual speech synthesis - description and determination
The authors present two visual articulation models for speech synthesis and methods to obtain them from measured data. The visual articulation models are used to control visible articulator movements described by six motion parameters: one for the up-down movement of the lower jaw, three for the lips and two for the tongue (see section 2.1 for details). To obtain the data, a female speaker was ...
MASSY - a Prototypic Implementation of the Modular Audiovisual Speech SYnthesizer
Audiovisual speech synthesis systems are usually inflexible with respect to replacing the audio synthesis, the video synthesis, and the control algorithms, because of dependencies between the implemented pieces. In order to enable a newly developed system to exchange modules, to evaluate their specific advantages, and to detect their weak points, the author proposes a framework for audiovisual speec...
Merging methods of speech visualization
The author presents MASSY, the Modular Audiovisual Speech SYnthesizer. The system combines two approaches to visual speech synthesis. Two control models are implemented: a (data-based) di-viseme model and a (rule-based) dominance model, both of which produce control commands in a parameterized articulation space. Analogously, two visualization methods are implemented: an image-based (video-realistic...
Journal: Speech Communication
Volume: 44
Pages: -
Publication date: 2004